Abstract
Introduction Treatment-free remission (TFR) remains unpredictable in chronic myeloid leukemia (CML) management, with half of patients relapsing after tyrosine kinase inhibitor (TKI) cessation due to persistent, difficult-to-isolate leukemic stem cells (LSCs). Current predictors, such as TKI therapy duration, deep molecular response (DMR) length, or BCR::ABL1 halving kinetics, are insufficient to precisely identify which patients harbor relapse-fated clones at diagnosis. Critically, no tools currently allow for relapse risk stratification at diagnosis, thereby limiting opportunities for early pre-emptive intervention. While previous studies have suggested that relapse biology may be preconfigured in diagnostic LSCs, specific molecular signatures predictive of future relapse have not been clearly defined.
Methodology To address this gap, we performed single-cell RNA sequencing (scRNA-seq) on a rare, clinically well-annotated cohort (n=52) of diagnostic bone marrow (BM; Australia, n=20; Singapore, n=4) and peripheral blood (PB; Australia, n=18; Japan, n=10) samples from CML patients stratified by TFR outcome. This approach enabled retrospective linkage of stem cell transcriptional states to long-term clinical fate. These diagnostic CML samples were compared with samples from 7 healthy donors. We profiled 149,000 BM cells, including 83,686 CD34⁺ hematopoietic progenitors, and applied integrative analyses, including gene set enrichment analysis (GSEA), gene set variation analysis (GSVA), differential gene expression testing, and CellChat-based intercellular communication analysis, to characterize LSC states, pathway activity, and microenvironmental interactions. In parallel, we profiled 35,525 CD34⁺ cells from the PB to determine whether diagnostic biomarkers identified in the BM compartment are preserved in the peripheral circulation.
Results We compared diagnostic LSCs (2,692 BCR::ABL1+ and 3,031 BCR::ABL1- stem cells) from patients who maintained remission versus those who relapsed following TKI cessation. GSVA demonstrated stark transcriptional differences at diagnosis between these groups. LSCs from relapsing patients showed pronounced enrichment of cell cycle progression programs, including E2F targets and DNA replication, and modest enrichment of lipid storage and immune signalling. Intriguingly, closer examination resolved two molecularly distinct, relapse-fated LSC subtypes at diagnosis: a proliferative subtype (Relapse-P; n = 6) defined by E2F-driven transcriptional programs (DNA replication), metabolic rewiring, and oncogenic signalling; and an immune subtype (Relapse-I; n = 8) characterized by inflammatory signalling, cell adhesion processes, and fatty acid metabolism. Both subtypes exhibited a Short-Term Hematopoietic Stem Cell (ST-HSC)-like state of activated quiescence, marked by CD71 expression. Critically, quiescent Relapse-P LSCs demonstrated druggable, non-canonical Polycomb Repressive Complex 1.1 (PRC1.1) activation, revealing a targetable vulnerability absent in remission cells. Cell-cell communication analysis uncovered self-reinforcing autocrine circuits specifically enriched in relapse LSCs, notably sterol metabolism (DHEA-PPARA/ESR1) and adhesion signalling (NAMPT-ITGA5/ITGB1). Common to both the Relapse-P and Relapse-I subgroups, we identified the cancer-testis antigens SPAG6 and PRSS21 as BM-specific candidate biomarkers. However, their expression was markedly diminished in the Relapse-PB cohort, with only SPAG6 retained at reduced sensitivity. These observations underscore the need for BM-centric biomarker development at the point of CML diagnosis.
Conclusion Our study provides compelling evidence that relapse fate in CML is molecularly pre-encoded within diagnostic LSCs years before therapy cessation, and it enables early risk stratification via SPAG6, PRSS21 and CD71 as markers of BM-derived LSCs. This atlas provides a valuable resource for dissecting LSC-intrinsic programs and immune–LSC interactions, while also enabling two key clinical applications: (1) development of early biomarkers based on BM-derived LSC signatures, and (2) rational design of combination therapies (e.g., TKI + epigenetic drugs) to improve curative outcomes in CML.